Automatic extraction of differences between spoken and written languages, and automatic translation from the written to the spoken language
Authors
Abstract
We extracted the differences between spoken language and written language from a spoken-language corpus and a written-language corpus by using the UNIX command "diff" and examined the differences to determine the construction of the grammars of the two corpora. We also transformed written-language sentences into spoken-language sentences by using rules based on the extracted differences.
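The two steps in the abstract (extracting differences, then rewriting with rules built from them) can be sketched roughly as follows. This is an illustration only, not the authors' implementation: it substitutes Python's standard `difflib.SequenceMatcher` for the UNIX `diff` command, it assumes the corpora are available as paired written/spoken sentences (the abstract does not say how they were aligned), and the English sentence pairs and the function names `extract_differences`, `build_rules`, and `apply_rules` are invented for the example.

```python
from collections import Counter
from difflib import SequenceMatcher


def extract_differences(written_tokens, spoken_tokens):
    """Return (written_span, spoken_span) pairs where the two token lists differ."""
    matcher = SequenceMatcher(a=written_tokens, b=spoken_tokens, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (tuple(written_tokens[i1:i2]), tuple(spoken_tokens[j1:j2]))
            )
    return differences


def build_rules(sentence_pairs):
    """Map each written span to the spoken span it was most often replaced by."""
    counts = Counter()
    for written, spoken in sentence_pairs:
        for w_span, s_span in extract_differences(written.split(), spoken.split()):
            if w_span:  # pure insertions have no written-side trigger, so skip them
                counts[(w_span, s_span)] += 1
    rules = {}
    for (w_span, s_span), _ in counts.most_common():
        rules.setdefault(w_span, s_span)  # keep the most frequent spoken form
    return rules


def apply_rules(sentence, rules):
    """Rewrite a written sentence left to right, preferring the longest matching span."""
    tokens = sentence.split()
    max_len = max((len(span) for span in rules), default=0)
    output = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in rules:
                output.extend(rules[span])
                i += n
                break
        else:
            output.append(tokens[i])  # no rule matched at this position
            i += 1
    return " ".join(output)


# Hypothetical (written, spoken) pairs, invented for this sketch.
pairs = [
    ("i do not know the answer", "i don't know the answer"),
    ("we cannot attend however we will call", "we can't attend but we will call"),
]
rules = build_rules(pairs)
print(apply_rules("however i cannot say that i do not care", rules))
```

Counting how often each difference occurs before turning it into a rule is a design choice of this sketch; it keeps a single noisy pair from overriding a more common correspondence.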
Similar resources
Semantic processing survey of spoken and written words in adolescents with cerebral palsy: Evidence from PALPA word-picture matching test
Objective: The present study aimed to assess and compare semantic processing of spoken and written words in adolescents with cerebral palsy and healthy adolescents. Method: The present study is quantitative in type and experimental in method. The examination group consisted of 30 adolescents with cerebral palsy, aged 10 to 15 years, selected by convenience sampling. All of ...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
Spoken Language Translation With The ITSVox System
This paper describes the ITSVox speech-to-speech translation prototype currently under development at LATL in collaboration with IDIAP. The ITSVox project aims at a general, interactive, multimodal translation system with the following characteristics: (i) it is not restricted to a particular subdomain, (ii) it can be used either as a fully automatic system or as an interactive system, (iii) it ...
Robust Extraction of Subcategorization Data from Spoken Language
Subcategorization data has been crucial for various NLP tasks. Current methods for automatic SCF acquisition usually proceed in two steps: first, generate all SCF cues from a corpus using a parser, and then filter out spurious SCF cues with statistical tests. Previous studies on SCF acquisition have worked mainly with written texts; spoken corpora have received little attention. Transcripts of ...
Transcribing human-directed speech for spoken language processing
As storage costs drop and bandwidth increases, there has been a rapid growth of spoken information available via the web or in online archives, raising problems of document retrieval, information extraction, summarization and translation for spoken language. While there is a long tradition of research in these technologies for text, new challenges arise when moving from written to spoken langua...
Publication date: 2002